A generalized LSTM-like training algorithm for second-order recurrent neural networks

نویسندگان

Derek Monner

James A. Reggia

چکیده

The long short term memory (LSTM) is a second-order recurrent neural network architecture that excels at storing sequential short-term memories and retrieving them many time-steps later. LSTM's original training algorithm provides the important properties of spatial and temporal locality, which are missing from other training approaches, at the cost of limiting its applicability to a small set of network architectures. Here we introduce the generalized long short-term memory(LSTM-g) training algorithm, which provides LSTM-like locality while being applicable without modification to a much wider range of second-order network architectures. With LSTM-g, all units have an identical set of operating instructions for both activation and learning, subject only to the configuration of their local environment in the network; this is in contrast to the original LSTM training algorithm, where each type of unit has its own activation and training instructions. When applied to LSTM architectures with peephole connections, LSTM-g takes advantage of an additional source of back-propagated error which can enable better performance than the original algorithm. Enabled by the broad architectural applicability of LSTM-g, we demonstrate that training recurrent networks engineered for specific tasks can produce better results than single-layer networks. We conclude that LSTM-g has the potential to both improve the performance and broaden the applicability of spatially and temporally local gradient-based training algorithms for recurrent neural networks.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets

The long short-term memory (LSTM) network trained by gradient descent solves difficult problems which traditional recurrent neural networks in general cannot. We have recently observed that the decoupled extended Kalman filter training algorithm allows for even better performance, reducing significantly the number of training steps when compared to the original gradient descent training algorit...

متن کامل

Deep LSTM for Large Vocabulary Continuous Speech Recognition

Recurrent neural networks (RNNs), especially long shortterm memory (LSTM) RNNs, are effective network for sequential task like speech recognition. Deeper LSTM models perform well on large vocabulary continuous speech recognition, because of their impressive learning ability. However, it is more difficult to train a deeper network. We introduce a training framework with layer-wise training and e...

متن کامل

Prototypical Recurrent Unit

The difficulty in analyzing LSTM-like recurrent neural networks lies in the complex structure of the recurrent unit, which induces highly complex nonlinear dynamics. In this paper, we design a new simple recurrent unit, which we call Prototypical Recurrent Unit (PRU). We verify experimentally that PRU performs comparably to LSTM and GRU. This potentially enables PRU to be a prototypical example...

متن کامل

High-Order Graph Convolutional Recurrent Neural Network: A Deep Learning Framework for Network-Scale Traffic Learning and Forecasting

Traffic forecasting is a challenging task, due to the complicated spatial dependencies on roadway networks and the time-varying traffic patterns. To address this challenge, we learn the traffic network as a graph and propose a novel deep learning framework, High-Order Graph Convolutional Long Short-Term Memory Neural Network (HGC-LSTM), to learn the interactions between links in the traffic net...

متن کامل

PAYMA: A Tagged Corpus of Persian Named Entities

The goal in the named entity recognition task is to classify proper nouns of a piece of text into classes such as person, location, and organization. Named entity recognition is an important preprocessing step in many natural language processing tasks such as question-answering and summarization. Although many research studies have been conducted in this area in English and the state-of-the-art...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

Neural networks : the official journal of the International Neural Network Society

دوره 25 1 شماره

صفحات -

تاریخ انتشار 2012

A generalized LSTM-like training algorithm for second-order recurrent neural networks

نویسندگان

چکیده

منابع مشابه

Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets

Deep LSTM for Large Vocabulary Continuous Speech Recognition

Prototypical Recurrent Unit

High-Order Graph Convolutional Recurrent Neural Network: A Deep Learning Framework for Network-Scale Traffic Learning and Forecasting

PAYMA: A Tagged Corpus of Persian Named Entities

عنوان ژورنال:

اشتراک گذاری